Skip to content

snapshots: send highest manifest slot #5932

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

cali-jumptrading
Copy link
Contributor

@cali-jumptrading cali-jumptrading commented Aug 5, 2025

Repair can start before the snapshot is fully loaded if the snapshot tiles send the highest manifest slot seen so far. The highest manifest slot is typically the incremental snapshot slot, but can be the full snapshot slot if incremental snapshots are disabled. The highest manifest slot is guaranteed to be monotonically increasing. It is sent by the snaprd tile when a snapshot peer is selected.

Makes use of #5777 to resolve snapshots for each peer.

@cali-jumptrading cali-jumptrading force-pushed the cali/snapshot-tiles-slot-message branch 7 times, most recently from 186fa5b to 3bdaafc Compare August 7, 2025 14:32
@cali-jumptrading cali-jumptrading changed the title Cali/snapshot tiles slot message wip snapshots: send highest manifest slot Aug 7, 2025
@cali-jumptrading cali-jumptrading force-pushed the cali/snapshot-tiles-slot-message branch from 3bdaafc to 950ec6b Compare August 7, 2025 14:40
@cali-jumptrading cali-jumptrading marked this pull request as ready for review August 7, 2025 14:45
@cali-jumptrading cali-jumptrading force-pushed the cali/snapshot-tiles-slot-message branch 4 times, most recently from aa7352d to 875936e Compare August 7, 2025 17:38
@cali-jumptrading cali-jumptrading force-pushed the cali/snapshot-tiles-slot-message branch from 875936e to 2673ae1 Compare August 8, 2025 20:16
@cali-jumptrading cali-jumptrading force-pushed the cali/snapshot-tiles-slot-message branch from 2673ae1 to 8af7620 Compare August 13, 2025 16:03
struct pollfd * pfd = &ssping->fds[ i ];
if( FD_UNLIKELY( pfd->revents & (POLLERR|POLLHUP) ) ) {
unping_peer( ssping, peer_pool_ele( ssping->pool, ssping->fds_idx[ i ] ), now );
continue;
}

if( FD_LIKELY( pfd->revents & POLLOUT ) ) {
struct icmphdr icmp_hdr = (struct icmphdr){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about this the more I think I was wrong about using HTTP server ACK response time. Using ping time is actually probably correct for the pinger because we are just trying to find the server in the closest data center. Whether they had some latency spike on the HTTP server when we sent request probably doesn't matter too much

#define FD_SSMSG_DONE (2) /* Indicates the snapshot is fully loaded and tiles are shutting down */
#define FD_SSMSG_MANIFEST_FULL (0) /* A snapshot manifest message from the full snapshot */
#define FD_SSMSG_MANIFEST_INCREMENTAL (1) /* A snapshot manifest message from the incremental snapshot */
#define FD_SSMSG_HIGHEST_MANIFEST_SLOT (2) /* The highest manifest slot so far, guaranteed to be monotonically increasing */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment doesn't seem right. Why would this be monotonically increasing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should really be
EXPECTED_MANIFEST_SLOT

Copy link
Contributor

@mmcgee-jump mmcgee-jump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants